Evolution of risk attitudes in the population

Author

  • Erdem Pulcu
Abstract

We live in an uncertain and dynamically changing world. Under uncertainty, the perception of outcome probabilities is an important factor in value-based decision-making, and it is directly linked to the survival of species. However, the evolutionary selection pressures shaping such behavioural traits have received almost no empirical attention. Here, we demonstrate that the fitness associated with different probability weighting preferences is influenced by the value properties of the environment, as well as by the characteristics and the density of competitors in the population. Although maintaining an unbiased perception of outcome probabilities may be regarded as optimal, in volatile environments it cannot recover from a marked population density disadvantage. Overweighting outcome probabilities, similar to having an optimism bias for rewarding outcomes, has better fitness when there is a large number of competing strategies in the population. Taken together, these results may be important for understanding probability weighting preferences in terms of an environmental adaptation.

Introduction

We live in an uncertain and ever-changing world, where our decisions are guided by our expectations of their outcomes. Optimal decision-making under uncertainty is a common problem faced by all biological entities in the higher classes of the animal taxonomy, and it is therefore crucial for the survival of species. The way in which we perceive the probabilities associated with desirable or aversive outcomes is an important factor in shaping our expectations, and a key tenet of modelling [reinforcement] learning and value-based decision-making processes (Behrens, Woolrich et al. 2007). Consequently, decision-making under uncertainty has been studied extensively in economics (Kahneman and Tversky 1979, Tversky and Fox 1995, Prelec 1998), as well as in the behavioural and neural sciences (Hsu, Bhatt et al. 2005, Tobler, O'Doherty et al. 2007, Hsu, Krajbich et al. 2009, Hunt, Kolling et al.
2012), with the aim of understanding how the brain extracts relevant information from the environment to resolve uncertainty and make optimal decisions between available options. Although theories of value-based decision-making are continuously expanding to account for various non-normative aspects of human behaviour observed in field experiments, two theories have historically been particularly influential: Expected Utility Theory (Mongin 1997, Dhami 2016) and Prospect Theory (Kahneman and Tversky 2013). Prospect Theory is often regarded as an advancement over Expected Utility Theory because it accounts for the non-linear perception of outcome probabilities, a sub-optimal aspect of value-based decision-making that is also important for understanding the resolution of the Allais paradox (Allais 1990). Despite their significance, the impact of evolutionary pressures shaping behavioural traits such as risk perception and non-linear probability weighting, which govern value-based decision-making, has so far received almost no empirical attention (Sinn 2003, Santos and Rosati 2015), particularly in comparison to game-theoretic (Smith 1993, Von Neumann and Morgenstern 2007, Camerer 2010) and interactive processes such as interpersonal cooperation (Nowak and Sigmund 1993, Axelrod 1997) or altruistic punishment (Boyd, Gintis et al. 2003). The current manuscript addresses this knowledge gap by bridging stochastic choice and stochastic population models in an evolutionary framework, providing quantitative analyses of the fitness trajectories associated with different probability weighting preferences under risk neutrality, competing in dynamically changing value environments.
The macroscopic approach presented here is important not only because global financial markets, where millions of traders interact every day, remain just as volatile as the physical environment of prehistoric times, but also for understanding the evolutionary biological roots of probability weighting preferences in the population.

Results

Optimal strategy in the deterministic choice model

In order to study how evolution might have shaped attitudes to probability weighting in the population, we conducted a series of simulated decision-making experiments (Fig. 1A). There were 4 types of agents defined in terms of their probability weighting preferences, namely unbiased, probability overweighting, probability underweighting and S-shaped (see Supplemental Materials and Methods (SMM) for mathematical definitions, and Fig. 1B for their graphical expression). These agents competed in a virtual environment containing 1 million randomly generated options, where rewards are delivered probabilistically (fig. S1). As one might expect, when the agents made decisions in isolation, the strategy with unbiased probability weighting accumulated more resources than the other strategies, both when the expected value difference between the options ($\Delta v$) in the environment varied randomly in a wide range (Fig. 1C) and when it varied within a limited range (a bound of 5 on $\Delta v$; Fig. 1D). However, these initial simulations, in which the agents made decisions in isolation, also assumed that agents' choices are deterministic, such that decision-makers are hypersensitive to even the subtlest changes in the expected value difference between the options. This assumption is not warranted, given numerous real-life behavioural economic experiments showing some degree of stochasticity in people's choices (Hsu, Krajbich et al. 2009).
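The deterministic choice setting described above can be sketched in a few lines. This is a minimal illustration only: the four weighting functions below are hypothetical stand-ins chosen to have the right qualitative shape, not the definitions given in the SMM, and the environment (two options per trial, magnitudes between 1 and 10) is likewise an assumption.

```python
import random

# Illustrative stand-ins for the four weighting preferences (NOT the SMM
# definitions): each maps an objective probability p to a subjective one.
WEIGHTING = {
    "unbiased": lambda p: p,
    "overweighting": lambda p: p ** 0.5,    # w(p) >= p on [0, 1]
    "underweighting": lambda p: p ** 2.0,   # w(p) <= p on [0, 1]
    # below p for p < 0.5, above p for p > 0.5
    "s_shaped": lambda p: p ** 2 / (p ** 2 + (1 - p) ** 2),
}


def make_environment(n_trials, seed=0):
    """Random two-option gambles: ((p1, m1), (p2, m2), u) per trial."""
    rng = random.Random(seed)
    env = []
    for _ in range(n_trials):
        first = (rng.random(), rng.uniform(1.0, 10.0))
        second = (rng.random(), rng.uniform(1.0, 10.0))
        env.append((first, second, rng.random()))  # u decides the outcome
    return env


def accumulate(weight, env):
    """Deterministic choice: always take the option with the larger w(p) * m."""
    realised = expected = 0.0
    for first, second, u in env:
        p, m = max((first, second), key=lambda o: weight(o[0]) * o[1])
        expected += p * m                 # objective expected value of the choice
        realised += m if u < p else 0.0   # probabilistic reward delivery
    return realised, expected


env = make_environment(10_000, seed=1)
for name, weight in WEIGHTING.items():
    realised, expected = accumulate(weight, env)
    print(f"{name:15s} realised={realised:10.1f} expected={expected:10.1f}")
```

Because the unbiased agent maximises the objective expected value on every trial, its expected total can never fall below that of the other agents in the same environment; the realised totals additionally depend on the random outcome draws.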

Optimal strategy in the stochastic choice model

In mathematical models of decision-making, the degree of stochasticity is defined by an inverse temperature term ($\beta$) adopted from thermodynamics (also see SMM). This coefficient modulates the subjective value difference between the options ($\Delta v$) in a softmax function, which in turn generates the choice probabilities of each of the available options (Daw 2011):

$$q_L = \frac{1}{1 + \exp(-\beta\,\Delta v)} \tag{1}$$

Assigning a moderate value to the $\beta$ coefficient suggests that stochasticity will have a particularly negative effect on the performance of strategies with an element of probability underweighting (Fig. 1E). Furthermore, using the unbiased strategy as a behavioural template, we demonstrate that the magnitude of accumulated rewards quickly saturates with increasing values of the $\beta$ coefficient (Fig. 1F), potentially indicating the upper boundary of its evolution in the population as a behavioural trait.

Optimal strategy in the stochastic population model

Following this straightforward but necessary introduction, we progress to a population-level analysis by duplicating the agents from the first stage to build a mixed model society ($N = 4 \times 10^4$) in which each of the 4 probability weighting strategies occupied an equal population density. Here, we linked the individual stochastic choice model with an evolutionary dynamic computational model (i.e. the stochastic population model) at the level of the expected value ($\Delta v$) and the choice probability ($q_L$) (see SMM for the full mathematical description of the models). This makes it possible to compute the expected random fitness ($F_A$) for any of the aforementioned strategies competing against each other to acquire rewards.
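The choice probability that links the two models is a logistic function of the subjective value difference. The sketch below shows how the inverse temperature changes it; the function name and the example values of the value difference are illustrative assumptions, while the coefficient values 0.2 and 2.6 are the ones mentioned in the text.

```python
import math


def choice_probability(delta_v: float, beta: float) -> float:
    """Logistic choice rule: 1 / (1 + exp(-beta * delta_v)).

    delta_v is the subjective value difference in favour of the option,
    beta the inverse temperature (0 gives random choice).
    """
    return 1.0 / (1.0 + math.exp(-beta * delta_v))


# Larger beta makes the agent more sensitive to the same value difference.
for beta in (0.0, 0.2, 1.0, 2.6):
    probs = [round(choice_probability(dv, beta), 3) for dv in (-2.0, 0.5, 2.0)]
    print(f"beta={beta:3.1f} -> {probs}")
```

With a coefficient of 0 every option is chosen with probability 0.5 regardless of its value; as the coefficient grows, the rule approaches the deterministic choice of the previous section.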
We created volatile value environments by segmenting the original 1 million gambles into 10,000 evolutionary time courses, each running for 100 generations, where the expected value difference between the options changed randomly from one generation to the next. The reward magnitudes in each probabilistic gamble corresponded to the amount of resources that can be acquired from the physical environment during a single generation on the simulation timeline (Fig. 2A). Once the expected random fitness ($F_A$) of the competing strategies is computed, it is straightforward to model the local process of co-evolution by natural selection, as previously proposed by Traulsen et al. (Traulsen, Claussen et al. 2005). Natural selection is implemented in terms of bidirectional transition rates between the groups ($r_{A \to B}$) from one generation to the next, and these are proportional to the between-group differences in expected random fitness. In contrast to the results of the individual choice models, where agents make decisions in isolation, this methodology reveals that increasing values of the $\beta$ coefficient enhance the performance of the unbiased strategy, which acquires a higher population density at time (t) = 100 (the time point where each simulation ended; Fig. 2B~F), and improve the overall fitness of the population (Fig. 2G). Here, it is important to point out that across all the simulations using the stochastic population model reported in this manuscript, the upper boundary of the $\beta$ coefficient was set to 2.6, a value previously reported by Hsu et al. (Hsu, Krajbich et al. 2009) in the context of value-based decision-making.

Is unbiased probability weighting an evolutionarily stable strategy for value-based decision-making?

Although the competition in the population is shown to be the strongest when $\beta = 0.2$, the unbiased strategy appeared to be the best-performing strategy overall.
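The idea of transition rates proportional to between-group fitness differences can be illustrated with a simple pairwise update. This is a hedged sketch, not the formulation of Traulsen et al. or of the SMM: the normalisation by the fitness range, the rate constant `kappa` and the function name are all assumptions made for the example.

```python
def selection_step(density, fitness, kappa=0.5):
    """One generation of pairwise selection.

    The net flow into group A from group B is taken to be
    kappa * N_A * N_B * (F_A - F_B) / (max F - min F), so densities move
    towards fitter groups while the total density stays at 1.
    kappa (0 < kappa <= 1) is a hypothetical rate constant.
    """
    span = max(fitness) - min(fitness)
    if span == 0:
        return list(density)  # equal fitness: no net transitions
    updated = []
    for n_a, f_a in zip(density, fitness):
        net = sum(n_b * (f_a - f_b) / span for n_b, f_b in zip(density, fitness))
        updated.append(n_a + kappa * n_a * net)
    return updated


# Four strategies starting from equal densities, with made-up fitness values.
density = [0.25, 0.25, 0.25, 0.25]
fitness = [1.0, 0.8, 0.2, 0.6]
for generation in range(100):
    density = selection_step(density, fitness)
print([round(n, 4) for n in density])
```

In the simulations described in the text the fitness values change from one generation to the next with the environment; the example keeps them fixed only to make the direction of the flow easy to see.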
From an evolutionary fitness point of view, it is noteworthy that a previous behavioural study reported parameter values for a 2-parameter probability weighting function which would also correspond to unbiased probability weighting in the gains domain (Stott 2006). However, unlike behavioural studies, which focus on participants' choices in isolation, our population-level analyses show that the performance of the unbiased strategy also depends on the degree and the characteristics of the volatility in the environment (Fig. 3A~C). Here, by focusing on the first 20 generations, where a clear separation occurs between the trajectories of the successful strategies, we considered three measures to quantify the environmental volatility: the magnitude of the change in the expected value of the environment from one generation to the next; the frequency of changes in the sign of the expected value difference (from − to +, or vice versa); and how gradually the expected value difference changed in the environment, assessed by the correlation coefficient between a vector containing the generation numbers and a vector containing the expected value differences. A highly positive or a highly negative correlation coefficient would mean that the environment, although volatile, is changing relatively gradually from one generation to the next (e.g. similar to bear or bull markets). Across 10,000 simulation environments, the unbiased strategy dominated the population 53.66% of the time, the overweighting strategy 37.58% of the time and the S-shaped strategy 8.76% of the time. It is important to note that across these 10,000 simulation environments, the underweighting strategy is consistently driven to extinction (Fig. 2B~G). Subsequent analysis suggested that the S-shaped strategy prevailed in environments in which the average magnitude of the change in expected value from one generation to the next was highest, and the unbiased strategy in those in which it was lowest (see fig.
S2; $F_{2,\,9999} = 20.43$, $p < 0.001$, Bonferroni corrected). The environments in which different strategies eventually dominated the population were comparable with respect to the other metrics of volatility (all $F_{2,\,9999} < 0.809$; all $p > 0.445$). We further decomposed the local process of co-evolution for each of these successful strategies in Fig. 3D~I, demonstrating the divergence of strategies with respect to the value properties of the environment. The overweighting strategy had better fitness relative to the other competitors when the expected value difference in the environment varied in a narrow range (a bound of 5 on $\Delta v$) and $\beta = 0.2$ (see fig. S3).

In the subsequent step, we investigated how the unbiased strategy would perform against other transient competitors, namely those which fall outside our predetermined, categorical strategies (e.g. one which consistently overweights probabilities). In order to address this question in an unbiased way, we preserved the normalised population density of all the strategies at t = 0 ($N_A = 0.25$). First, we varied the values of the probability weighting coefficient and the $\beta$ coefficient in two probability weighting functions and the sigmoid function simultaneously (see SMM; Eq. 7 and 4, respectively) on a 30x30 numerical grid (i.e. the parameter space) with values ranging from 0 to 3 on each axis. Here, varying the values of the probability weighting coefficient produced strategies with different probability weighting preferences competing against the unbiased strategy (see SMM for mathematical definitions and fig. S4A for a graphical expression of the probability weighting functions), whereas varying the values of the $\beta$ coefficient from 0 to 3 covered all possible degrees of stochasticity in agents' choice probabilities within this range. The changing values of the $\beta$ coefficient affected the choice probabilities of all strategies equally, including the unbiased strategy.
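Returning to the three volatility measures introduced for the first 20 generations, they can be computed from a single series of expected value differences as follows. The function name, the use of the mean absolute change for the first measure and the Pearson form of the correlation are assumptions made for this sketch; the source does not state which estimator was used.

```python
import math


def volatility_measures(delta_v, window=20):
    """Return (mean absolute change, number of sign changes, trend correlation).

    delta_v holds the expected value difference of each generation; only the
    first `window` generations are used, and at least two values are required.
    """
    x = list(delta_v[:window])
    steps = list(zip(x, x[1:]))
    # 1) magnitude of change from one generation to the next
    mean_abs_change = sum(abs(b - a) for a, b in steps) / len(steps)
    # 2) how often the sign of the difference flips
    sign_changes = sum(1 for a, b in steps if a * b < 0)
    # 3) Pearson correlation between generation number and value difference
    gens = range(1, len(x) + 1)
    mean_g = sum(gens) / len(x)
    mean_x = sum(x) / len(x)
    cov = sum((g - mean_g) * (v - mean_x) for g, v in zip(gens, x))
    var_g = sum((g - mean_g) ** 2 for g in gens)
    var_x = sum((v - mean_x) ** 2 for v in x)
    trend = cov / math.sqrt(var_g * var_x) if var_x > 0 else 0.0
    return mean_abs_change, sign_changes, trend


gradual = [i + 0.5 for i in range(-10, 10)]   # steadily rising environment
erratic = [1.0, -1.0] * 10                    # flips sign every generation
print(volatility_measures(gradual))
print(volatility_measures(erratic))
```

A steadily rising series gives a correlation close to 1 and a single sign change, whereas the alternating series gives many sign changes and a correlation close to 0, matching the distinction between gradual and erratic volatility drawn in the text.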
Then, the simulations described above were repeated in the same 10,000 volatile environments to cover all possible combinations of the probability weighting and $\beta$ coefficients in this 30x30 parameter space (thus performing $9 \times 10^7$ simulations in total). The average normalised population density of the unbiased strategy at each intersection was then converted to a heat map (fig. S4B). This investigation showed that the performance of the unbiased strategy is sensitive to changes in the characteristics of the competitors, and that there are regions in the parameter space where the unbiased strategy loses prevalence relative to its initial population density. This suggests that absolute accuracy in weighting outcome probabilities during value-based decision-making cannot be an evolutionarily stable strategy (Smith 1982), and provides additional support for the preceding analysis (Fig. 3A~C).

Could unbiased probability weighting preference emerge from a minority in the population?

After performing a visual inspection, we selected the intersection of the two coefficients at 0.8276 and 1.0345 as a reference point for the next stage of investigations (fig. S4B), aimed at understanding whether the unbiased strategy could have evolved from a small minority in the population. Although this selection is arbitrary (considering the full size of the parameter space, such an informed selection is necessary for progressive analysis), it serves as a suitable reference point: at this intersection of the two coefficients, the unbiased strategy occupied 51.63% of the population (i.e. the average value across 10,000 environments) at the end of the simulation time course, despite initially occupying 25% of the population at t = 0. This outcome suggests that at this intersection of the parameters, the unbiased strategy remains the most successful strategy in terms of evolutionary fitness, occupying the majority of the population (>50%).
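The two reference values quoted above are consistent with a grid of 30 evenly spaced points running from 0 to 3 inclusive. The sketch below builds such a grid; that the original parameter space was constructed exactly this way is an assumption, and the function name is illustrative.

```python
def parameter_grid(low=0.0, high=3.0, points=30):
    """Evenly spaced values from low to high, both endpoints included."""
    step = (high - low) / (points - 1)
    return [low + i * step for i in range(points)]


grid = parameter_grid()
# Every combination of the two coefficients forms the 30x30 parameter space.
space = [(w, b) for w in grid for b in grid]
print(len(grid), len(space))
print(round(grid[8], 4), round(grid[10], 4))
```

Under this assumption the step between neighbouring grid points is 3/29, and the values 0.8276 and 1.0345 fall on the ninth and eleventh grid points.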
Next, we fixed the probability weighting and $\beta$ coefficients to their values at this intersection and repeated these simulations, assigning a different normalised population density to the unbiased strategy at t = 0 (values gradually increasing from 0.01 to 0.25), with the three remaining competitors always sharing the rest of the population equally. Here, we show that in volatile environments the unbiased strategy cannot recover from a population density disadvantage unless this disadvantage is almost negligible (Fig. 4A). In order to understand how the fitness of the unbiased strategy varies in relation to changes in the characteristics of its competitors, we further zoomed in on one of the outcomes of this set of simulations, in which the unbiased strategy lost prevalence relative to its initial population density but remained competitively in the population at the end of the simulation time course (Fig. 4B). Again, although this selection is arbitrary, it serves as a useful reference point for understanding the nature of these competitive interactions: the unbiased strategy losing prevalence relative to its initial population density without becoming extinct suggests that there must be strong competition between the strategies present in the population. It is important to point out that this informed selection is somewhat necessary because of the high number of output combinations resulting from the interaction between the different ratios by which initial population densities can be set and the values of the probability weighting and $\beta$ coefficients (i.e. the grid size of the parameter space multiplied by all possible values which can be assigned to the initial population density).
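The initial conditions for this minority analysis can be written down directly. In the sketch below the function name and the step of 0.01 between successive starting densities are assumptions; the source only states that the values increase gradually from 0.01 to 0.25.

```python
def initial_densities(unbiased_share, n_competitors=3):
    """Unbiased strategy first, then competitors sharing the rest equally."""
    rest = (1.0 - unbiased_share) / n_competitors
    return [unbiased_share] + [rest] * n_competitors


# Hypothetical sweep: 0.01, 0.02, ..., 0.25 for the unbiased strategy.
shares = [round(0.01 * k, 2) for k in range(1, 26)]
for share in (shares[0], shares[-1]):
    print(share, [round(n, 4) for n in initial_densities(share)])
```

At the upper end of the sweep all four strategies start from the equal density of 0.25 used in the earlier simulations.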
Under these constraints, we showed that the landscape representing the outcome of these evolutionary competitions is highly diverse (see Fig. 4B for a 3-dimensional graphical representation). This suggests that subtle changes in the environmental conditions (including the characteristics of the competitors) may produce quite distinct evolutionary outcomes for the unbiased strategy, highlighting a degree of randomness in the evolutionary processes occurring within the population (Fig. 4B-C). It is important to clarify that this is due to the degree of stochasticity in agents' choice probabilities (i.e. the values of the $\beta$ coefficient) interacting differently with the competitors' probability weighting preferences (i.e. the values of the probability weighting coefficient), which influences the overall fitness of the agents with unbiased probability weighting (Fig. 4B).

Co-evolution of probability weighting preferences in populations with inherent variability

In the final step, we investigated the evolution of probability weighting preferences in a population with a large degree of inherent variability. This scenario is based on the general assumption that behavioural characteristics with large variability (Hunt 2014) observed in the population emerge and continue to exist concurrently, suggesting that they may have co-evolutionary dynamics. We constructed such an environment by assigning different values to the probability weighting coefficient in both the 2-parameter and the log2 probability weighting functions, drawn from the parameter space described above. Consequently, in this investigation the population contained 61 different strategies with equal population density at (t) = 0: the unbiased strategy; 30 strategies constructed by assigning different values to the coefficient in the 2-parameter probability weighting function; and 30 strategies constructed by assigning different values to the coefficient in the log2 probability weighting function.
We fixed the value of the $\beta$ coefficient in the stochastic choice model to 0.2, as we had previously established that this value maximised the competition between the strategies (Fig. 2B), so that more information about the nature of the evolutionary competition between coexisting strategies could be obtained. Across 10,000 simulations, the unbiased strategy did not emerge as the best strategy in terms of its evolutionary fitness, and it performed relatively poorly compared with a strategy which overweights probabilities (Fig. 4D~F). Additionally, we showed that the strategies which subjectively augmented the likelihood of low-probability outcomes and attenuated the likelihood of high-probability outcomes (i.e. inverse-S-shaped) performed the worst, after excluding the strategy which makes decisions randomly.

Discussion

The present results demonstrate that the fitness associated with different probability weighting preferences is influenced by the value properties of the environment, by agents' choice stochasticity, and by the characteristics and the density of competitors in the population. In volatile value environments analogous to bull and bear markets (Gonzalez, Powell et al. 2005), reward probability weighting following the trajectory of the log2 functional form could be the most competitive strategy (Fig. 3). Although maintaining an unbiased perception of outcome probabilities may otherwise be regarded as optimal, quantitative evidence suggests that in volatile environments it cannot recover from a marked population density disadvantage (Fig. 4A-B). Finally, the simulations show that a strategy which overweights outcome probabilities, similar to having an optimistic outlook over the full reward probability spectrum, may emerge as the best strategy when there is a large number of competing strategies in the population (Fig. 4D) and the expected value difference between the options in the environment changes in a narrow range (fig. S3).

Potential implications for behavioural ecology

The macroscopic, evolutionary approach presented here may provide valuable insights for behavioural ecology. Formulating population models of probability weighting preferences is critical for a canonical understanding of decision-making processes in predator-prey encounters (Lima 2002, Hebblewhite, Merrill et al. 2005), during foraging (Orrock, Danielson et al. 2004, Higginson, Fawcett et al. 2012) and in the trade-offs between them (Hebblewhite and Merrill 2009), all of which relate to evolutionary fitness and the natural selection of species. Laboratory studies which could inform the development of these population models also suggest that higher-order primates are capable of tracking probabilities associated with pleasant as well as undesirable outcomes (Lakshminarayanan, Chen et al. 2011), with probabilities associated with rewards being encoded in midbrain dopaminergic (Fiorillo, Tobler et al. 2003) and posterior cingulate neurons (McCoy and Platt 2005). Additionally, there is accumulating evidence of variability in value-based decision-making across species. For example, bonobos (Heilbronner, Rosati et al. 2008) and lemurs (MacLean, Mandalaywala et al. 2012) show a preference for risk aversion, whereas rodents (Adriani and Laviola 2006) and macaques (McCoy and Platt 2005, Hayden and Platt 2007) prefer risky options. A recent study suggests that the computations underlying value-based decision-making in animals utilise functions with nonlinear properties similar to those of humans (Stauffer, Lak et al. 2015). Although it is uncertain how well these laboratory findings represent the computations underlying value-based decision-making in the wild (Paglieri, Addessi et al.
2014), one prediction of our model is that when species compete to acquire resources in a finite and volatile environment, species which systematically underweight probabilities could eventually be selected against.

Understanding preference for underweighting probabilities

Inevitably, this prediction raises a question about the prevalence of underweighting probabilities in the population (Cohn, Lewellen et al. 1975, Kahneman and Tversky 1979). A considerable number of studies have reported probability weighting preferences with nonlinear properties: an overweighting tendency for probabilities lower than approximately 0.34, but a marked underweighting for probabilities exceeding this threshold (Stott 2006, Hsu, Krajbich et al. 2009, Tanaka, Camerer et al. 2010) (also see SMM and fig. S5). On the other hand, the studies which used a probability weighting function similar to the log2 functional form reported here focused on how people made value-based decisions while learning the hidden probabilities associated with rewards or punishments by predictive sampling from the environment (Behrens, Woolrich et al. 2007, Suzuki, Harasawa et al. 2012, Browning, Behrens et al. 2015). Arguably, these latter experimental designs have higher ecological validity for understanding probability weighting preferences in the population in real-life financial decision-making situations, considering that decision-makers do not always have full access to the decision variables necessary for computing the expected value difference between the options they face. The probability weighting functions reported by these studies also have nonlinear properties to account for their subjective modulation, but unlike the studies mentioned at the beginning of this section, their functional form was mainly expressed in terms of an underweighting tendency for probabilities lower than 0.5 and an overweighting tendency beyond this cut-off point (Behrens, Woolrich et al. 2007, Suzuki, Harasawa et al.
2012). In the current work, we provide complementary evidence showing that under favourable conditions (Fig. 3C), individuals who utilise a similar probability weighting function to guide their value-based decisions in volatile environments or markets will be the most competitive agents in terms of evolutionary fitness, particularly in environments where the volatility is high in terms of the magnitude of change in resources from one generation to the next (fig. S2). Taken together, the log2 functional form may be another suitable candidate to represent hardwired probability weighting preferences in humans for everyday financial decisions.

Potential implications for financial markets

From a complementary perspective, the findings we present here may have important implications for understanding traders' decisions in global financial markets, including those which involve cryptocurrency exchanges, which exhibit volatility characteristics similar to those of our simulated environments. For example, it is frequently debated whether risky decisions are among the triggering causes of global financial crises (Rajan 2005), which seem to have shortening cycles. The current investigation shows that overweighting reward outcome probabilities could be understood in terms of an evolutionary adaptation among traders who carry out transactions in financial environments which are both highly competitive and volatile. In this respect, the present results complement the findings of two recent behavioural studies which showed evidence that observing others' value-based decisions could influence one's own preferences in the same direction (Chung, Christopoulos et al. 2015, Suzuki, Jensen et al. 2016). Taken together, these individual and population-level mechanisms may lead to the spread of optimistic expectations for probabilistic rewards in competitive financial markets.
However, as stated previously, unlike our simulated environments, in which the competing agents had full access to the decision variables, the risks associated with returns are mostly hidden in real life, and traders have to learn to estimate them from imperfect information (Mishkin 1992, Behrens, Woolrich et al. 2007, Suzuki, Harasawa et al. 2012). The present results suggest that, under these circumstances, strategies which overweight reward probabilities would lose their fitness advantage (fig. S6), and such deviations from neutrality could increase the stress on the global economy until it is discharged in the form of a financial crisis, which happened twice in the last decade.

Potential implications for understanding behavioural pathologies

Finally, our macroscopic approach could also inform the evolutionary perspective on psychopathology (Baron-Cohen 2013), which posits that clinically debilitating conditions may be associated with fitness advantages. We propose that strategies which overweight reward probabilities, which could be highly adaptive in certain conditions, may contribute to a hardwired biological vulnerability to psychiatric disorders associated with risk-seeking and sensation-seeking behaviours, such as pathological gambling (Clark 2010).

Publication date: 2017